Abstract

Robustness is an important property of complex networks. Many studies have examined
network robustness, in particular the error and attack tolerance of a network's
connectivity and shortest paths. In this paper, the error and attack tolerance of a
network's community structure is studied by randomly and purposely disturbing the
edges of the network. Two targeted perturbation methods are designed: one is based on
the edge-clustering coefficient and the other attacks triangles. The dissimilarity
function D is used to quantify the change in community structure, and the modularity Q
is used to quantify the significance of community structure. The numerical results show
that perturbation makes a network's community structure less clear, and that targeted
attacks damage the community structure more than random attacks do.

1 Introduction

In recent years, more and more systems in many fields have been described as complex
networks. Robustness, an important property of many systems, has recently been studied
within the framework of complex networks. Past research has mainly focused on the
topological aspects of robustness, and is usually carried out either by removing nodes
(through random failure or targeted attack) or by rewiring a fraction of the edges or
adding new edges [1]. Networks with special structure, for instance scale-free networks,
are found to be less robust to targeted attack and more robust to random failure.
Robustness is usually measured by either the size of the largest connected component or
the length of the shortest path between pairs of nodes. Some investigations try to find
effective ways to improve the robustness of real networks against attack.

Studies of networks have provided much evidence that communities exist in social
networks, metabolic networks, economic networks [2][3][4][5][6][7] and so on. Community
structure is an important feature for understanding the functional properties of complex
networks. For instance, in social networks, communities can form according to career or
age. In food webs, communities may reveal the subsystems of an ecosystem [8]. In
biochemical or neural networks, communities may correspond to functional groups [5]. In
the World Wide Web, community analysis has found thematic groups [9][10]. Email networks
can be divided into departmental groups whose work is distinct, and the communities
reveal organizational structure [11][12]. Moreover, the study of dynamics on complex
networks shows that vertices belonging to the same community synchronize easily [13].
A deeper understanding of community structure therefore helps us comprehend and analyze
the characteristics of systems. Most research on community structure focuses on
algorithms for detecting communities in networks [2][14].

The robustness of community structure has not received enough attention. Recently,
Karrer et al. [15] proposed a random perturbation method that rewires edges randomly and
used it to study the robustness of community structure, which is an error tolerance
study. In [15], it is proposed that a robustness study can serve as another way of
measuring the significance of community structure. The attack tolerance of community
structure has not yet been studied, and it is the main focus of this paper. Based on the
idea in [15], we propose two targeted methods and investigate their effect on community
structure in networks. This paper is organized as follows. In Section 2 we introduce
three network perturbation methods, including one random method. In Section 3 the error
and attack tolerance of community structure is investigated through the dissimilarity
function D. We disturb artificial and real networks using the targeted and random attack
methods and analyze the numerical results. In Section 4 the effect of attack on the
significance of community structure and on the topological character of the network is
explored through the modularity Q and the average edge-clustering coefficient C. It is
found that targeted attack causes serious damage to a network's structure. The attack is
simulated on both artificial and real networks and the numerical results are analyzed.
Section 5 is the conclusion.



2 Attack Methods on Networks

In a network, triangles are a basic unit and play an important role in community
structure. There are divisive algorithms that use the triangle structure to find
communities in networks, such as the algorithm introduced by Radicchi et al. [16], and
they give good results. It is found that edges connecting nodes in different communities
are included in few or no triangles. The edge-clustering coefficient was first introduced
in [16] for the edge connecting node i to node j:

C^{(3)}_{ij} = \frac{z^{(3)}_{ij}}{\min[(k_i - 1), (k_j - 1)]} \qquad (1)

where z^{(3)}_{ij} is the number of triangles built on that edge and
min[(k_i - 1), (k_j - 1)] is the maximal possible number of them. Edges connecting nodes
within communities are included in more triangles and tend to have large values of
C^{(3)}_{ij}. Our first targeted attack method attacks the edges with the largest
C^{(3)}_{ij}, and the second targeted attack method attacks the edges that are included
in the largest number of triangles. Initially a network with N nodes and M edges is
given.
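As an illustration, the edge-clustering coefficient can be computed directly from the
neighbour sets of the two end nodes. The following is a minimal sketch, assuming the
coefficient is the number of triangles on an edge divided by min[(k_i - 1), (k_j - 1)].
The function name `edge_clustering`, the adjacency-dictionary input and the convention
of returning 0 when the denominator vanishes are illustrative choices that do not come
from the original method.

```python
def edge_clustering(adj):
    """Return {(i, j): coefficient} for every edge of an undirected simple graph.

    adj maps each node to the set of its neighbours (hypothetical input format).
    """
    coeff = {}
    for i in adj:
        for j in adj[i]:
            if i < j:  # visit every undirected edge once
                # Common neighbours of i and j = triangles built on edge (i, j).
                z = len(adj[i] & adj[j])
                max_possible = min(len(adj[i]) - 1, len(adj[j]) - 1)
                # Assumed convention: an edge with a degree-1 end node cannot
                # belong to any triangle, so its coefficient is set to 0.
                coeff[(i, j)] = z / max_possible if max_possible > 0 else 0.0
    return coeff


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coeff = edge_clustering(adj)
```

In this toy graph the three triangle edges reach the maximal value, while the pendant
edge (2, 3), which lies in no triangle, gets the value 0.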

The targeted attack method based on the edge-clustering coefficient (EC method for
short) is as follows.

1. Compute C^{(3)}_{ij} for every edge and rank the edges by this coefficient, so that
the edges with the largest values come first.

2. Remove the first αM edges in this ranking from the network.

3. Go through the pairs of nodes and add αM new edges between randomly selected pairs,
so that the removed edges are rewired at random.

No edge is moved when α = 0, and all edges are moved when α = 1. The numbers of nodes
and edges stay constant throughout.
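The EC method can be sketched in a few lines. This is only an illustration under
stated assumptions: edges with the largest coefficient are removed first, the number of
moved edges is rounded to the nearest integer, ties are broken by edge label, and new
edges are drawn uniformly among node pairs that are not currently connected. The name
`ec_attack` and its parameters are hypothetical.

```python
import random


def ec_attack(edges, alpha, seed=None):
    """Move the alpha*M edges with the largest edge-clustering coefficient."""
    rng = random.Random(seed)
    edges = {tuple(sorted(e)) for e in edges}
    nodes = sorted({v for e in edges for v in e})
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def coefficient(edge):
        u, v = edge
        denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
        # Assumed convention: 0 when no triangle can be built on the edge.
        return len(adj[u] & adj[v]) / denom if denom > 0 else 0.0

    # Largest coefficient first; ties are broken by edge label (an assumption).
    ranked = sorted(edges, key=lambda e: (-coefficient(e), e))
    n_move = int(round(alpha * len(edges)))  # rounding rule is an assumption
    perturbed = set(ranked[n_move:])

    # Re-insert the same number of edges between random node pairs,
    # avoiding self-loops and duplicate edges, so M stays constant.
    while len(perturbed) < len(edges):
        u, v = rng.sample(nodes, 2)
        perturbed.add((min(u, v), max(u, v)))
    return perturbed


# Two triangles joined by the bridge (2, 3).
toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
perturbed = ec_attack(toy, alpha=0.5, seed=1)
```

Because the bridge (2, 3) lies in no triangle, it is ranked last and is the final edge
to be removed as α grows.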

The targeted attack method based on triangles (T method for short) is as follows.

1. Compute z^{(3)}_{ij} for every edge and rank the edges by z^{(3)}_{ij}, so that the
edges included in the most triangles come first.

2. Count the number t of edges with z^{(3)}_{ij} ≠ 0, and remove the first α_1 t edges
in this ranking from the network.

3. Randomly add α_1 t edges to the network.

In the T method, t is the total number of edges included in triangles. No edge is moved
when α_1 = 0, and t edges are moved when α_1 = 1. For comparison with the other
perturbation methods, we still use α to denote the ratio of the number of edges actually
moved to the total number of edges, α = α_1 t / M. The numbers of nodes and edges stay
constant throughout.
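A corresponding sketch of the T method is given here, again as an illustration only. It
assumes that edges in the most triangles are removed first, that the number of moved
edges is rounded to the nearest integer, and that new edges are drawn uniformly among
unconnected node pairs. The function `t_attack` is a hypothetical name; it also returns
the effective perturbation strength α = α_1 t / M computed from the number of edges
actually moved.

```python
import random


def t_attack(edges, alpha1, seed=None):
    """Move a fraction alpha1 of the t edges that belong to triangles."""
    rng = random.Random(seed)
    edges = {tuple(sorted(e)) for e in edges}
    nodes = sorted({v for e in edges for v in e})
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Number of triangles built on each edge = number of common neighbours.
    triangles = {e: len(adj[e[0]] & adj[e[1]]) for e in edges}
    candidates = [e for e in edges if triangles[e] > 0]
    t = len(candidates)

    # Most triangles first; ties are broken by edge label (an assumption).
    ranked = sorted(candidates, key=lambda e: (-triangles[e], e))
    n_move = int(round(alpha1 * t))  # rounding rule is an assumption
    perturbed = edges - set(ranked[:n_move])

    # Add the same number of random edges so that M stays constant.
    while len(perturbed) < len(edges):
        u, v = rng.sample(nodes, 2)
        perturbed.add((min(u, v), max(u, v)))

    alpha = n_move / len(edges)  # fraction of all M edges actually moved
    return perturbed, alpha


# Two triangles joined by the bridge (2, 3): t = 6 of the M = 7 edges.
toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
perturbed, alpha = t_attack(toy, alpha1=1.0, seed=2)
```

On this toy graph the bridge is never attacked, so α cannot exceed 6/7 even when
α_1 = 1, which is why the T curves can be shorter than the others.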

The random attack method (R method for short) is designed here for comparison with the
targeted attacks.

1. Remove αM edges from the network at random.

2. Randomly add αM edges to the network.

The main difference between the two targeted attack methods is that the second one
purposely disturbs only the edges contained in triangles rather than all of the edges in
the network. The base of the perturbation is therefore different. In the second method,
no edges are rewired if α_1 = 0; if α_1 = 1, however, all of the edges that form
triangles are rewired, rather than all of the edges in the network. During the
perturbation the average degree and the size N of the network stay constant. Our attack
methods differ from the method in [15] in that our perturbation scheme generates
networks in which the expected degrees of the vertices are not the same as the original
degrees.

3 Robustness of Community Structure



3.1 Quantification of variation of Community Structure 

Quantifying the difference between community structures is not a new problem
[17][18][19]. The community structure found by an algorithm is usually compared with the
actual communities if they are known. When analyzing the precision of a certain
algorithm, the comparison of different community structures is also needed. Several
methods for measuring similarities or differences between community divisions have
therefore been designed. The function D [20] is a simple and efficient measure of the
difference between two community divisions, and is chosen here as a measure of the
robustness of community structure.

In this section, based on the three perturbation methods proposed above, we measure the
variation of community structure after perturbation using the dissimilarity function D.
In the study of Karrer et al., robustness was analyzed using the function V, the
variation of information, which is based on information entropy [15]. We choose D
because it reflects the same character of variation of community structure as V does,
and D is normalized to 1. In our computer experiments, D was also found to be more
sensitive than V.

The idea of the dissimilarity function D was introduced in [20]. Consider the similarity
and dissimilarity of two sets A and B defined as subsets of Ω. The similarity is
expressed by A ∩ B, and the dissimilarity corresponds to (A ∩ B̄) ∪ (Ā ∩ B). The
normalized similarity s and dissimilarity d can be written as

s = \frac{|A \cap B|}{|A \cup B|}, \qquad d = \frac{|(A \cap \bar{B}) \cup (\bar{A} \cap B)|}{|A \cup B|} \qquad (2)



Consider two particular divisions of a network, each consisting of several communities,
and assume for now that both have k communities. First, construct the correspondence
between the communities of the two divisions that gives the matched pairs the largest
similarity. Second, calculate the dissimilarity d_i of each matched pair of communities.
Then use the dissimilarities of all pairs to calculate the dissimilarity of the two
divisions as a whole:



D = \frac{1}{k} \sum_{i=1}^{k} d_i \qquad (3)



However, in most cases two community structures do not have the same number of
communities, which means that not every community has a counterpart. To solve this
problem, a community X_i that has no counterpart is matched with the empty set ∅, and k
is set to the larger of the two numbers of communities. Under this definition, the
maximum value of D is 1 and the minimum value is 0, where 0 means no difference and 1
means the largest difference.
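The definition of D can be illustrated with a short sketch. It assumes that D is the
average over k matched pairs of the normalized dissimilarity |A Δ B| / |A ∪ B|, which
equals one minus the normalized similarity. The matching step here is a simple greedy
pairing by decreasing similarity, which is an assumption: the text only requires the
correspondence with the largest similarity and does not prescribe how to find it. The
function name `dissimilarity` and the list-of-sets input are hypothetical.

```python
def dissimilarity(part_x, part_y):
    """Dissimilarity D between two partitions given as lists of node sets."""
    xs = [set(c) for c in part_x]
    ys = [set(c) for c in part_y]
    k = max(len(xs), len(ys))  # the larger number of communities
    if k == 0:
        return 0.0

    # Normalized similarity of every candidate pair of communities.
    candidates = sorted(
        ((len(x & y) / len(x | y), i, j)
         for i, x in enumerate(xs)
         for j, y in enumerate(ys)
         if x | y),
        reverse=True,
    )

    # Greedy matching (an assumption): most similar unmatched pair first.
    used_x, used_y = set(), set()
    total = 0.0
    for s, i, j in candidates:
        if i not in used_x and j not in used_y:
            used_x.add(i)
            used_y.add(j)
            total += 1.0 - s  # normalized dissimilarity of the matched pair

    # Communities without a counterpart are matched with the empty set,
    # which gives a dissimilarity of 1 for each of them.
    total += k - len(used_x)
    return total / k


same = dissimilarity([{0, 1}, {2, 3}], [{2, 3}, {0, 1}])
merged = dissimilarity([{0, 1}, {2, 3}], [{0, 1, 2, 3}])
```

In the second call one of the two original communities is matched with the merged
community (dissimilarity 0.5) and the other with the empty set (dissimilarity 1), so
the sketch returns (0.5 + 1) / 2 = 0.75.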



3.2 Numerical results of Robustness of Community Structure 

For a given network, we now have all the components needed to analyze robustness. First,
we obtain the community structure C of the network with any existing algorithm. Here we
use the EO algorithm to detect the community structure because of its good performance
[21]. Then, we disturb the original network with each of the methods proposed in the
previous section separately, and obtain the new community structure C'. Next, we measure
the variation of community structure with the function D introduced in Section 3.1.
Since the perturbation methods involve random choices, we repeat the second step several
times and take the average value of the variation, so that it is not affected by special
cases. The whole process is repeated as the parameter α changes from 0 to 1.

Figure 1: Value of function D versus α. Three attack methods are simulated on artificial
benchmark networks: (a) k_out = 2 on the GN benchmark; (b) k_out = 10 on the GN
benchmark; (c) LFR benchmark, N = 500, ⟨k⟩ = 10, p(k) ~ k^{-γ}, γ = 2.5,
p(s) ~ s^{-β}, β = 2.0, mixing parameter μ = 0.3.

Figure 2: Value of function D versus α. Three attack methods are simulated on real
networks: (a) Karate club network; (b) football network; (c) econophysicist network.
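The measurement procedure of this section (detect, perturb, detect again, compare,
average, sweep over the perturbation strength) can be written as a small generic loop.
This is a sketch only: `robustness_curve` and its callable arguments `detect`,
`perturb` and `dissimilarity` are hypothetical placeholders for a community detection
algorithm, one of the attack methods, and the function D, none of which is implemented
here.

```python
def robustness_curve(edges, detect, perturb, dissimilarity, alphas, repeats=10):
    """Average dissimilarity between original and perturbed community structure.

    detect(edges) -> partition; perturb(edges, alpha, seed=...) -> new edges;
    dissimilarity(partition_a, partition_b) -> float. All three are supplied
    by the caller (hypothetical interfaces).
    """
    reference = detect(edges)  # community structure of the original network
    curve = []
    for alpha in alphas:
        values = []
        for r in range(repeats):
            # A different seed per repetition averages out the randomness
            # of the perturbation step.
            perturbed = perturb(edges, alpha, seed=r)
            values.append(dissimilarity(reference, detect(perturbed)))
        curve.append((alpha, sum(values) / len(values)))
    return curve
```

Passing the callables in as arguments keeps the loop independent of the particular
detection algorithm and attack method being compared.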

To test the efficiency of the targeted attack methods proposed here, the three attack
methods are simulated on artificial networks generated by the GN benchmark and the LFR
benchmark, and on real networks including the karate club network, the football network,
and the econophysicist collaboration network.

First, we apply the methods to the benchmark of Girvan and Newman (GN). The GN benchmark
generates homogeneous networks and is widely used in the evaluation of community
detection algorithms. These networks consist of 128 vertices divided into 4 communities
of 32 nodes each. Every node is connected on average to ⟨k_in⟩ nodes of its own group
and ⟨k_out⟩ nodes of the rest of the network. The total degree of each node is
k = k_in + k_out and is always kept at 16. As the average number k_out of between-group
connections per vertex is increased from zero, the community structure in the network,
stark at first, becomes gradually obscured until, at the point where between- and
within-group edges are equally likely, the network becomes a standard Poisson random
graph with no community structure at all. Here, the two cases k_out = 2 and k_out = 10
are used.
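A network of this type can be generated with a simple planted-partition sketch. It
assumes that every within-group pair is linked independently with probability
k_in / 31 and every between-group pair with probability k_out / 96, so that the degree
equals 16 in expectation rather than exactly. The function `gn_benchmark` and its
parameter names are illustrative.

```python
import random


def gn_benchmark(k_out, n=128, groups=4, k_total=16, seed=None):
    """Generate a GN-type network; returns (edge set, group label per node)."""
    rng = random.Random(seed)
    size = n // groups                 # 32 nodes per group
    k_in = k_total - k_out
    p_in = k_in / (size - 1)           # 31 possible partners inside the group
    p_out = k_out / (n - size)         # 96 possible partners outside the group
    membership = [v // size for v in range(n)]

    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if membership[u] == membership[v] else p_out
            if rng.random() < p:
                edges.add((u, v))
    return edges, membership


edges, membership = gn_benchmark(k_out=2, seed=1)
mean_degree = 2 * len(edges) / len(membership)
```

With k_out = 2 most edges fall inside the groups, which corresponds to the clear
community structure described above; with k_out = 10 the two kinds of edges are much
closer in number.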
Fig. 1(a,b) shows the results of applying our analysis method to graphs of this type.
The figure shows the variation of community structure D as a function of the parameter
α, which measures the amount of perturbation. For the small value k_out = 2, the
variation of community structure increases faster with α under the two targeted attacks
than under random attack, as shown in Fig. 1(a). The variation of community structure D
starts at zero when α = 0, as we would expect for an unperturbed network, rises rapidly,
and then levels off as α approaches its maximum value of 1. The curves of the targeted
methods depart significantly from that of the random method, indicating that the
community structure discovered by the algorithm is less robust against targeted
perturbation. Furthermore, the curve of the EC method departs significantly from that of
the T method, indicating that the community structure discovered by the algorithm is
relatively fragile against the EC method.

The curves representing the three methods do not have the same length. The reason is
that the total number of edges that can be changed differs between methods. For the R
method and the EC method, all of the edges in the network can be moved. For the T
method, which targets triangles, only the edges that form triangles can be disturbed,
and their number may be much smaller than the total number of edges.

Large values of k_out in the GN benchmark generate networks with obscure community
structure; k_out = 10 is used here. As shown in Fig. 1(b), D keeps a high value and is
almost unchanged as α changes from zero to one under the three attack methods,
indicating that the attack tolerance and error tolerance of obscure community structure
are close to each other. It is necessary to point out that the value of D is not always
exactly zero when α = 0, which is caused by the EO algorithm.

Then the methods are applied to networks generated by the benchmark of Lancichinetti et
al. (LFR) [33]. LFR is a generalization of the GN benchmark to heterogeneous group sizes
and degree distributions. Groups are again fixed a priori, with the degrees and the
community sizes following power-law distributions. As before, each node has k_in
connections within its own group and k_out edges linking elsewhere. To investigate the
robustness of community structure, networks with significant community structure are
needed, and the parameters of the LFR benchmark are set as follows. The average degree
is ⟨k⟩ = 10 and the size is N = 500. The degree distribution follows a power law
P(k) ~ k^{-γ} with γ = 2.5, and the community size distribution follows a power law
P(s) ~ s^{-β} with β = 2. The mixing parameter μ = k_out/k indicates the "strength" of
the communities, and is set to 0.3 here. The maximal community size is 80 and the
minimal is 20. With these parameters, networks with clear community structure are
generated.

For the LFR benchmark, as shown in Fig. 1(c), the curves of the targeted methods also
depart significantly from that of the random method, indicating that the community
structure discovered by the algorithm is fragile against targeted perturbation. When a
small number of edges with high C or a high number of triangles are moved, the community
structure changes more than when edges are moved at random. The results on the three
artificial networks suggest that the targeted methods destroy community structure more
efficiently than random attack.

Turning now to real-world networks, we have tested our method on examples that are
mainly social networks. A selection of results is shown in Fig. 2.

Fig. 2(a) shows the variation of community structure as a function of α for one of the
best studied examples of community structure in a social network, the karate club
network of Zachary [53]. The vertices in this network represent members of a karate club
at a US university in the 1970s and the edges represent friendships between members
based on independent observations by the experimenter. The network is widely believed to
show strong community structure and repeated studies have upheld this view.

The black (square) and blue (triangle) points in the figure show the variation of
community structure under the targeted attacks, while the red points show the results
under random attack. In this case the community structure is essentially as robust
against targeted perturbation as against random perturbation.

We then apply the methods to the football network [25]; the results are shown in
Fig. 2(b). The network of American college football teams is a representation of the
schedule of Division I games for the 2000 season: vertices in the graph represent teams
(identified by their college names) and edges represent regular-season games between the
two teams they connect. The network contains 115 nodes and 613 edges, and is known to
have significant community structure. The curves of the targeted methods again depart
significantly from that of the random method, indicating that the community structure
discovered by the algorithm is less robust against targeted perturbation. For a given
fraction of disturbed edges, the variation of community structure caused by the methods
we introduced is larger than that caused by random disturbance.

The results for the econophysicist collaboration network [26] are shown in Fig. 2(c).
In this network, nodes represent econophysicists and edges represent their
collaborations. We analyze its largest connected component, which contains 271 nodes.
Because the different methods disturb different sets of edges, the curves in Fig. 2(c)
do not have the same length. The curve of the EC method departs significantly from that
of the R method, indicating that the community structure discovered by the algorithm is
relatively fragile against EC perturbation.

Comparing the analyses of the three real networks, we find that the perturbation methods
we proposed are generally more efficient than random disturbance: under the same
perturbation strength, the targeted attack methods cause more damage to the original
community structure than the random attack method. These results indicate that, when
investigating the robustness of community structure, targeted attacks based on the
edge-clustering coefficient or on triangles can be more efficient methods.

4 Modularity and Edge-clustering coefficient 

In the sections above, the robustness of community structure was discussed through
targeted and random perturbation of networks, and it was found that community structure
is fragile against targeted attack. In this section, we focus on the effect of
perturbation on topological characteristics of the network that are related to community
structure, namely the modularity function and the edge-clustering coefficient.

4.1 Modularity function 

The modularity function Q [27] is now widely used to measure the significance of a
network's community structure:

Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \qquad (4)

where A_ij is an element of the adjacency matrix of the network, A_ij = 1 if there is an
edge between node i and node j and A_ij = 0 otherwise, k_i is the degree of vertex i,
defined as the number of edges connected to node i, m = \frac{1}{2} \sum_{ij} A_{ij} is
the number of edges in the whole graph, and c_i denotes the community to which vertex i
belongs; δ(c_i, c_j) equals 1 if c_i = c_j and 0 otherwise. If the network is divided
properly, the fraction of edges within communities should be larger than expected for a
randomized network. A larger value of Q means a better community structure. For a given
process one can calculate Q for each split of a network into communities, and there are
usually only one or two local peaks. The positions of these peaks usually correspond
closely to the expected divisions.
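For concreteness, Q can be evaluated directly from an edge list and a community
assignment. The sketch below assumes the standard form
Q = (1/2m) Σ_ij [A_ij - k_i k_j / (2m)] δ(c_i, c_j), built from the quantities defined
above. The function name `modularity` and the dictionary input are illustrative, and the
double loop favours readability over speed.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges: iterable of node pairs of an undirected simple graph.
    community: dict mapping every node to its community label.
    """
    edge_set = {frozenset(e) for e in edges}
    m = len(edge_set)  # number of edges in the whole graph
    nodes = list(community)
    degree = {v: 0 for v in nodes}
    for e in edge_set:
        for v in e:
            degree[v] += 1

    q = 0.0
    for i in nodes:
        for j in nodes:
            if community[i] != community[j]:
                continue  # delta(c_i, c_j) = 0
            a_ij = 1 if i != j and frozenset((i, j)) in edge_set else 0
            q += a_ij - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)


# Two triangles joined by one bridge, split into the two obvious communities.
toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
split = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q_split = modularity(toy, split)
q_single = modularity(toy, {v: 0 for v in range(6)})
```

For the two-triangle example each community holds 3 of the 7 edges and half of the total
degree, so Q = 2 (3/7 - 1/4) = 5/14, while putting all nodes into a single community
gives Q = 0.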

When using the EO algorithm, modularity analysis is necessary to find which partition is
the best one. The optimal modularity of each network, including the original one and the
disturbed ones, can be calculated. In this way, we can measure whether the best
community partition of the disturbed network is clear. Fig. 3 and Fig. 4 show the change
in modularity on the three artificial and three real networks above, caused by the three
methods introduced above. Modularity becomes smaller as the perturbation strength
increases, except for the GN benchmark with k_out = 10. The modularity of this
artificial network is low and stays almost unchanged under both targeted and random
attack. Meanwhile, the curves of the targeted methods depart significantly from that of
the random method, indicating that targeted attacks make the community structure
discovered by the algorithm more obscure than random attack does.

Figure 3: Value of modularity Q versus α. Three attack methods are simulated on
artificial benchmark networks: (a) k_out = 2 on the GN benchmark; (b) k_out = 10 on the
GN benchmark; (c) LFR benchmark, N = 500, ⟨k⟩ = 10, p(k) ~ k^{-γ}, γ = 2.5,
p(s) ~ s^{-β}, β = 2.0, mixing parameter μ = 0.3.



4.2 Edge-clustering coefficient 

In Section 2 it was mentioned that the edge-clustering coefficient is a measure of the
strength of a network's connections, and that a network with significant community
structure usually has a large edge-clustering coefficient. In this part, we investigate
how the average edge-clustering coefficient C varies with the perturbation strength α.
In most cases C decreases quickly under the targeted methods as α increases from zero,
as shown in Fig. 5 and Fig. 6. An interesting phenomenon also appears on the GN
benchmark with k_out = 10 and on the football network: C decreases at the beginning and
increases later.

Figure 4: Value of modularity Q versus α. Three attack methods are simulated on real
networks; Q decreases with α, which means that perturbation makes community structure
less clear. (a) Karate club network; (b) football network; (c) econophysicist network.



In this section, the results for modularity and the edge-clustering coefficient suggest
that the targeted attack methods cause more damage to the aspects of network topology
that are related to community structure.




Figure 5: Value of edge-clustering coefficient C versus α. Three attack methods are
simulated on artificial benchmark networks: (a) k_out = 2 on the GN benchmark;
(b) k_out = 10 on the GN benchmark; (c) LFR benchmark, N = 500, ⟨k⟩ = 10,
p(k) ~ k^{-γ}, γ = 2.5, p(s) ~ s^{-β}, β = 2.0, mixing parameter μ = 0.3.



5 Conclusion



In conclusion, we propose two targeted methods for disturbing a network, both of which
differ from random disturbance. We then use the random method and our two methods to
disturb the original network, and use the function D, which is more convenient than V,
to compare the variation of community structure. The analysis of D, Q and C shows that
targeted attack methods based on the edge-clustering coefficient and on triangles damage
community structure more than random disturbance does. These facts indicate that
community structure is fragile against targeted attack and imply that the triangle is
important and deserves more attention in the study of community structure.

Figure 6: Value of edge-clustering coefficient C versus α. Three attack methods are
simulated on real networks: (a) Karate club network; (b) football network;
(c) econophysicist network.